AIDING VISUAL SEARCH IN A LIST OF LEARNABLE SPEECH COMMANDS

BACKGROUND 

Field of the Invention 

[0001] The present invention relates to the field of speech recognition and, more 
particularly, to speech-based user interfaces. 

Description of the Related Art 

[0002] Speech recognition is the process by which an acoustic signal is converted to 
a set of text words by a computer. These recognized words may then be used in a 
variety of computer software applications for purposes such as document preparation, 
data entry and command and control. Speech recognition is generally a difficult 
problem due to the wide variety of pronunciations, individual accents and speech 
characteristics of individual speakers. For example, the speaker may speak very rapidly 
or softly, slur words or mumble. When transcribing speech dictation, this may result in: 
spoken words being converted into different words ("hold" recognized as "old"); 
improperly conjoined spoken words ("to the" recognized as "tooth"); and spoken words 
recognized as homonyms ("boar" instead of "bore"). However, when controlling and 
navigating through speech-enabled applications by voice, incorrect recognition or non- 
recognition typically results in the execution of unintended commands or no command 
at all. 

[0003] To rectify incorrectly recognized voice commands, conventional speech 
recognition systems include a user-initiated interface or window containing a list of 
possible commands. The list may be a listing of the entire speech command 
vocabulary, or a partial listing constrained by acoustic, language or context modeling 
techniques known in the art. The constrained lists are much more user friendly, since 
the speaker does not have to read through a lengthy list to find an intended command. 
These constrained lists can be generated, for example, by executing an algorithm, as is 
known in the art, one much like a spell checking program in word processing 
applications, to search a command grammar for words with similar characteristics as 
the incorrectly recognized words. Once the list is generated, the user may select the 
intended command by voice or input device. Alternatively, the user may key in the 
desired command in a text field within the user interface. 

[0004] One of the problems with a user-initiated interface or window containing a list 
of possible commands is that the command list does not change as users become more 
familiar with certain commands. As a result, to find a less commonly-used command, the 
user must still view all of the possible commands, including the commonly-used commands 
with which the user is familiar. Such is the case with respect to both corrective-type 
graphical user interfaces ("GUIs") and so-called "What-Can-I-Say" dialog boxes or 
interfaces. 

[0005] Accordingly, it would be beneficial to adjust the command list, thereby 
highlighting less commonly-used commands and/or reducing the salience of more 
commonly-used commands. This would reduce the visual search time for a user, 
thereby increasing the user's efficiency and/or speed of using the speech recognition 
system. 

SUMMARY OF THE INVENTION 

[0006] The present invention provides a method, a system, and an apparatus for 
aiding a visual search in a list of learnable speech commands. More specifically, the 
present invention is capable of making less commonly-used commands more salient 
and more commonly-used commands less salient. The present invention takes into 
account the evidence that a user has learned or memorized the more commonly-used 
commands, thereby reducing the visual search time needed by a user. 
[0007] In general, the present invention provides a method of aiding a visual search 
in a list of learnable speech commands. The system presents a list of commands for a 
graphical user interface and monitors for a user to select a voice command from the 
command list. Once the user has spoken a command, the system updates the 
command measurements in the database and then compares the updated command 
measurements with the criteria set forth for adjusting the graphical user interface. If the 
updated command measurements do not satisfy the criteria set forth for adjusting the 
graphical user interface, then the display of the command is not adjusted. However, if 
the updated command measurements do satisfy the criteria set forth for adjusting the 
graphical user interface, then the display of the command is adjusted. 
[0008] More particularly, in one embodiment, the present invention provides a 
method for aiding a visual search in a list of learnable speech commands including the 
steps of presenting a display list of commands to a user; monitoring whether the user 
has selected a command; measuring an evidentiary value; comparing the evidentiary 
value to a programmed value to determine if an adjustment criteria has been satisfied; 
and adjusting the display of the selected command. 

[0009] More particularly, in another embodiment, the present invention provides a 
machine-readable storage having stored thereon, a computer program having a plurality 
of code sections, said code sections executable by a machine for causing the machine 
to perform the steps of presenting a display list of commands to a user; monitoring 
whether the user has selected a command; measuring an evidentiary value; comparing 
the evidentiary value to a programmed value to determine if an adjustment criteria has 
been satisfied; and adjusting the display of the selected command. 

[0010] In yet another embodiment, the present invention provides a system for aiding 
a visual search in a list of learnable speech commands including means for presenting a 
display list of commands to a user; means for monitoring whether the user has selected 
a command; means for measuring an evidentiary value; means for comparing the 
evidentiary value to a programmed value to determine if an adjustment criteria has been 
satisfied; and means for adjusting the display of the selected command. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0011] There are shown in the drawings, embodiments which are presently 
preferred, it being understood, however, that the invention is not limited to the precise 
arrangements and instrumentalities shown. 

[0012] FIG. 1 is an example of a graphical user interface prior to the method of the 
present invention and FIG. 1a is an example of a graphical user interface after a period 
of time has elapsed. 

[0013] FIG. 2 is a flow chart illustrating a method for aiding a visual search in 
accordance with the inventive arrangements disclosed herein. 



{WP155622;2} 



Page 6 of 16 



Docket No. BOC9-2003-0067 (438) 

DETAILED DESCRIPTION OF THE INVENTION 
[0014] The present invention provides a method, a system, and an apparatus for 
aiding a visual search in a list of learnable speech commands. More specifically, the 
present invention makes less commonly-used commands more salient by taking into 
account evidence that a user has learned or memorized more commonly-used 
commands. These more commonly-used commands are made less salient such that 
the unlearned commands are easier to find, thereby reducing the visual search time 
needed by a user. 

[0015] Speech recognition systems provide users with many different speech 
commands for navigating through the system. These command lists remind users of 
the commands the user can (or should) use in a speech recognition system. As users 
become more familiar with certain commands, they will use them automatically and will 
only need to review a command list for the commands that have yet to be memorized. 
Many different types of command lists may be displayed. In one embodiment, the 
command list may be a matrix of code words to support voice spelling. In another 
embodiment, the speech recognition system may present a display of user help 
commands, such as any listing or GUI displaying a listing of speech recognition 
software commands. This may include "What-Can-I-Say" interfaces, as well as others. 
[0016] The present invention tracks measurements of user performance with 
commands over time. The present invention then uses these measurements to 
determine the likelihood that a user has become more familiar with a particular 
command. As the user becomes more familiar with a command, and with the 
measurement tracking providing evidence of this familiarity, the present invention 
gradually makes the more familiar command less and less salient on the command list. 
As a result, the less familiar commands become more salient, i.e., they stand out on the 
list, enabling a user to locate a command more quickly and thereby reducing the 
time needed for the user to execute a command. 

[0017] The present invention may encompass any method by which more 
commonly-used commands are made less salient. In one embodiment, if the evidence 
of user memorization of a command is tracked over time, the command or commands 
may gradually be grayed or lightened (from black in a black and white display) such that 
the less commonly-used commands "stand out" in relation to the more commonly-used 
commands. In this embodiment, the more commonly-used commands may remain 
grayed or lightened, or they may be removed entirely, such as by placing these 
commands in an inactive location that the user may refer to in the event the user forgets 
a command that was once previously memorized. 
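
By way of illustration only, the graying described above might be driven by a per-command salience number that is mapped to a display color. The sketch below assumes a salience scale from 0.0 to 1.0 and a hexadecimal gray color format; both the scale and the function name `salience_to_gray` are hypothetical and are not part of the disclosed embodiments.

```python
def salience_to_gray(salience):
    """Map a salience in [0.0, 1.0] to a hex gray; 1.0 is black, 0.0 is white."""
    # Clamp out-of-range inputs so the color is always valid.
    clamped = min(1.0, max(0.0, salience))
    # Lower salience gives a lighter gray (higher channel value).
    level = round(255 * (1.0 - clamped))
    return f"#{level:02x}{level:02x}{level:02x}"


print(salience_to_gray(1.0))  # #000000 (unadjusted, fully black)
print(salience_to_gray(0.0))  # #ffffff (fully lightened)
```

A command whose salience reaches a chosen lower bound could instead be moved to the inactive location mentioned above; that choice is left open here.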

[0018] In an alternative embodiment, the present invention may also operate by 
making less commonly-used commands more bold, again such that the less commonly- 
used commands stand out in relation to the more commonly-used commands. 
[0019] In yet another embodiment, the present invention may also rank the 
commands in an order based upon the likelihood that the commands have been 
memorized. For example, as the evidence demonstrates that a command has become 
memorized, the command can be moved down a list of commands, thereby causing the 
less commonly-used commands to be listed first, such that the user may quickly identify 
a command that the evidence demonstrates the user has not learned or memorized. 
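
As a minimal sketch of this ranking, and not a definitive implementation, the commands can be sorted by an estimate of how likely each is to have been memorized. The `memorization_score` field and its 0.0 to 1.0 scale are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    # Hypothetical likelihood (0.0 to 1.0) that the user has memorized the command.
    memorization_score: float


def rank_for_display(commands):
    """Return commands ordered so the least-memorized appear first."""
    # sorted() is stable, so commands with equal scores keep their original order.
    return sorted(commands, key=lambda c: c.memorization_score)


commands = [
    Command("alpha", 0.9),
    Command("quebec", 0.1),
    Command("echo", 0.8),
    Command("xray", 0.2),
]
ranked = [c.name for c in rank_for_display(commands)]
print(ranked)  # ['quebec', 'xray', 'echo', 'alpha']
```
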
[0020] In still another embodiment, the system may readjust the salience of a 
command if the evidence demonstrates that a user has forgotten a command that the 
user appeared to have learned. For example, a command may gradually become 
grayed or lightened if measured values were less than programmed values. However, if 
the evidence demonstrates that the measured value is greater than the programmed 
value, the command may gradually be blackened or darkened, thereby making the 
command more salient. 

[0021] The present invention increases the salience of less commonly-used 
commands and/or reduces the salience of more commonly-used (or memorized) 
commands by tracking measurements of user performance with these commands over 
time. Figures 1 and 1a present an example of the present invention in relation to a 
voice spelling interface for a personal data assistant (PDA). As a user gains experience 
with the code words used in the voice spelling interface, the user will likely use those 
code words more quickly, i.e. less time will elapse from the end of the utterance of the 
previous code word to the beginning of the utterance of the current word. By comparing 
this length of time to a programmed value, the salience of the current word may then be 
adjusted. The programmed value may be any value deemed to be a "normal" value, 
such as the measurement of expert users engaged in voice spelling. If the measured 
time is less than the programmed value, this provides evidence that a user has become 
more familiar with the command and may have learned or memorized the command. 
As a result, the system operates to make this command less salient. Over time, as the 
evidence continues to show a measured time less than the programmed value, the 
command will become even less salient. 

[0022] Conversely, if the measured time is equal to or greater than the programmed 
value, then the evidence demonstrates that a user is not as familiar with the command 
and the salience of the command remains the same, or may be increased. 
Nevertheless, even for those commands that remain the same, since memorized 
commands are being made less salient, those commands that remain the same are still 
effectively becoming more salient and standing out as compared to the memorized 
commands. 
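
The time-based comparison described in the two preceding paragraphs can be sketched as follows. This is an illustration under stated assumptions: the salience scale (1.0 is fully black), the step size, the lower bound `floor`, and the `restore` flag are all hypothetical values and names chosen for the example.

```python
def adjust_salience(salience, measured_time, programmed_time,
                    step=0.1, floor=0.2, restore=False):
    """Return the new salience (1.0 = fully black, lower = more grayed).

    measured_time is the time in seconds from the end of the previous
    utterance to the start of the current one. All defaults are illustrative.
    """
    if measured_time < programmed_time:
        # Evidence of familiarity: lighten the command a little further.
        return max(floor, round(salience - step, 2))
    if restore:
        # Optional variant: darken a command the user appears to have forgotten.
        return min(1.0, round(salience + step, 2))
    # Equal or slower than the programmed value: leave the command unchanged.
    return salience


salience = 1.0
for pause in [0.4, 0.5, 0.3]:  # three fast uses against a programmed value of 0.8 s
    salience = adjust_salience(salience, pause, 0.8)
print(salience)  # 0.7
```

Because each fast use lowers the salience by only one step, the command fades gradually rather than disappearing after a single quick utterance.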

[0023] Figures 1 and 1a provide a visual demonstration of the present invention in 
use. Figure 1 shows the voice spelling interface for a PDA prior to any adjustment. 
Figure 1a shows how the voice spelling interface might appear after several weeks of 
use. As can be seen, those words that correspond to letters that are more commonly 
used, such as the vowels and common consonants as well as common commands, 
have been grayed or lightened while the less commonly-used commands remain black 
or have been grayed only slightly. As such, the less commonly-used commands stand 
out in comparison to the more commonly-used commands, thereby enabling a user to 
more quickly identify the word necessary to be used for less commonly-used and/or 
unfamiliar letters. 

[0024] Those skilled in the art will recognize that the present invention may be used 
with any of a variety of command lists and with any variety of values other than a time 
value. Accordingly, although a spelling command list and a time value have been used 
for purposes of illustration, the present invention is not limited to the particular 
examples and embodiments as disclosed herein. In particular, the measurement value 
and the programmed value may be any value that provides evidence to the system that 
the user has become familiar with and/or memorized a particular command. As such, 
the system measures this evidentiary value and uses the evidentiary value to determine 
whether the selected command should be adjusted. As used herein, an "evidentiary 
value" is a measured value that provides evidence to the system that the user has 
become familiar with and/or memorized a particular command. In the preceding 
example, the evidentiary value was the time that elapsed from the end of the utterance 
of the previous command to the beginning of the utterance of the selected command. 
[0025] Besides time values, another value that may be used as an evidentiary value 
in the present invention is a frequency value. A frequency value is a value of the 
frequency with which the user uses a command. The higher the frequency relative to 
other commands, the greater the likelihood that a user has memorized the command. 
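
For illustration, a relative frequency value might be computed from a log of recognized commands as sketched below. The log format, the function name `relative_frequencies`, and the 0.3 threshold are assumptions made for the example, not values taken from the disclosure.

```python
from collections import Counter


def relative_frequencies(usage_log):
    """Map each command to its share of all recorded uses."""
    counts = Counter(usage_log)
    total = sum(counts.values())
    return {command: count / total for command, count in counts.items()}


usage_log = ["alpha", "echo", "alpha", "zulu", "alpha", "echo", "alpha", "echo"]
freqs = relative_frequencies(usage_log)
# Hypothetical programmed value: a share above 0.3 counts as evidence of memorization.
likely_memorized = sorted(c for c, f in freqs.items() if f > 0.3)
print(likely_memorized)  # ['alpha', 'echo']
```
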
[0026] Once the evidentiary value has been measured, it is compared to a 
programmed value to determine if an adjustment criteria has been met or satisfied. As 
used herein, an "adjustment criteria" is a criteria set forth in the system such that, if the 
adjustment criteria is satisfied, the system will operate to adjust the display list of 
commands whereas if the criteria is not satisfied, the system will not adjust the display 
list of commands. The adjustment may be to the selected command, the remaining 
commands or both. The criteria will generally be a comparison of the measured value 
to the programmed value and the adjustment criteria may be satisfied if the measured 
value exceeds the programmed value or the adjustment criteria may be satisfied if the 
measured value is less than the programmed value, depending on which evidentiary 
value is used. 
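
Because the direction of the comparison depends on which evidentiary value is used, one possible representation pairs the programmed value with the comparison to apply. The class name `AdjustmentCriteria` and the numeric values below are hypothetical and serve only to illustrate the paragraph above.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AdjustmentCriteria:
    programmed_value: float
    # Comparison that must hold between the measured and programmed values.
    comparison: Callable[[float, float], bool]

    def is_satisfied(self, measured_value: float) -> bool:
        return self.comparison(measured_value, self.programmed_value)


# Time value: a pause shorter than the programmed value suggests familiarity.
time_criteria = AdjustmentCriteria(programmed_value=0.8, comparison=operator.lt)
# Frequency value: a share larger than the programmed value suggests familiarity.
frequency_criteria = AdjustmentCriteria(programmed_value=0.3, comparison=operator.gt)

print(time_criteria.is_satisfied(0.5))       # True
print(frequency_criteria.is_satisfied(0.1))  # False
```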

[0027] Figure 2 is a flow chart illustrating a method 200 for aiding a visual search in 
accordance with one embodiment of the present invention. The method 200 may begin 
in step 205 where the system presents a list of commands within a graphical user 
interface. In step 210, the system monitors for a user to speak a voice command. The 
command may be any speech command used in a graphical user interface, such as a 
spelling command in a voice spelling interface or a help request in a user help interface. 
[0028] Once the user has spoken a command, the system updates the command 
measurements in the database in step 215. The system, in step 220, then compares 
the updated command measurements with the criteria set forth for adjusting the 
graphical user interface. If the updated command measurements do not satisfy the 
criteria set forth for adjusting the graphical user interface, then the display of the 
command is not adjusted. However, if the updated command measurements do satisfy 
the criteria set forth for adjusting the graphical user interface, then the display of the 
command is adjusted in step 225. In either instance, the system returns to step 230 
and continues monitoring for spoken commands unless the user terminates the 
program. 
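
The flow of method 200 can be approximated by the following sketch, which simulates steps 205 through 225 over pre-recorded events instead of live speech. It assumes a time-based evidentiary value, uses in-memory dictionaries in place of the database, and takes illustrative values for the programmed time and step size; the function name `run_method_200` is hypothetical.

```python
def run_method_200(command_list, spoken_events, programmed_time=0.8, step=0.1):
    """Simulate the loop of FIG. 2 over recorded (command, pause) events."""
    # Step 205: present the list; every command starts fully salient (1.0).
    salience = {command: 1.0 for command in command_list}
    measurements = {command: [] for command in command_list}

    # Step 210: monitor for spoken commands (here, iterate over recorded events).
    for command, pause in spoken_events:
        if command not in salience:
            continue  # ignore anything outside the displayed list
        # Step 215: update the stored measurements for this command.
        measurements[command].append(pause)
        # Step 220: compare the updated measurement with the adjustment criteria.
        if pause < programmed_time:
            # Step 225: adjust the display of the command (make it less salient).
            salience[command] = max(0.0, round(salience[command] - step, 2))
        # Otherwise the display is left unchanged and monitoring continues.
    return salience


events = [("alpha", 0.4), ("zulu", 1.5), ("alpha", 0.3), ("stop", 0.2)]
result = run_method_200(["alpha", "zulu"], events)
print(result)  # {'alpha': 0.8, 'zulu': 1.0}
```

In this sketch the loop ends when the recorded events are exhausted, which stands in for the user terminating the program.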

[0029] The present invention can be realized in hardware, software, or a combination 
of hardware and software. The present invention can be realized in a centralized 
fashion in one computer system, or in a distributed fashion where different elements are 
spread across several interconnected computer systems. Any kind of computer system 
or other apparatus adapted for carrying out the methods described herein is suited. A 
typical combination of hardware and software can be a general purpose computer 
system with a computer program that, when being loaded and executed, controls the 
computer system such that it carries out the methods described herein. 
[0030] The present invention also can be embedded in a computer program product, 
which comprises all the features enabling the implementation of the methods described 
herein, and which when loaded in a computer system is able to carry out these 
methods. Computer program in the present context means any expression, in any 
language, code or notation, of a set of instructions intended to cause a system having 
an information processing capability to perform a particular function either directly or 
after either or both of the following: a) conversion to another language, code or 
notation; b) reproduction in a different material form. 

[0031] This invention can be embodied in other forms without departing from the 
spirit or essential attributes thereof. Accordingly, reference should be made to the 
following claims, rather than to the foregoing specification, as indicating the scope of the 
invention. 